• Àüü
  • ÀüÀÚ/Àü±â
  • Åë½Å
  • ÄÄÇ»ÅÍ
´Ý±â

»çÀÌÆ®¸Ê

Loading..

Please wait....

¿µ¹® ³í¹®Áö

Ȩ Ȩ > ¿¬±¸¹®Çå > ¿µ¹® ³í¹®Áö > TIIS (Çѱ¹ÀÎÅͳÝÁ¤º¸ÇÐȸ)

TIIS (Çѱ¹ÀÎÅͳÝÁ¤º¸ÇÐȸ)

Current Result Document :

ÇѱÛÁ¦¸ñ(Korean Title) Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction
¿µ¹®Á¦¸ñ(English Title) Question Similarity Measurement of Chinese Crop Diseases and Insect Pests Based on Mixed Information Extraction
ÀúÀÚ(Author) Han Zhou   Xuchao Guo   Chengqi Liu   Zhan Tang   Shuhan Lu   Lin Li  
¿ø¹®¼ö·Ïó(Citation) VOL 15 NO. 11 PP. 3991 ~ 4010 (2021. 11)
Çѱ۳»¿ë
(Korean Abstract)
¿µ¹®³»¿ë
(English Abstract)
The Question Similarity Measurement of Chinese Crop Diseases and Insect Pests (QSM-CCD&IP) aims to judge the user¡¯s tendency to ask questions regarding input problems. The measurement is the basis of the Agricultural Knowledge Question and Answering (Q & A) system, information retrieval, and other tasks. However, the corpus and measurement methods available in this field have some deficiencies. In addition, error propagation may occur when the word boundary features and local context information are ignored when the general method embeds sentences. Hence, these factors make the task challenging. To solve the above problems and tackle the Question Similarity Measurement task in this work, a corpus on Chinese crop diseases and insect pests (CCDIP), which contains 13 categories, was established. Then, taking the CCDIP as the research object, this study proposes a Chinese agricultural text similarity matching model, namely, the AgrCQS. This model is based on mixed information extraction. Specifically, the hybrid embedding layer can enrich character information and improve the recognition ability of the model on the word boundary. The multi-scale local information can be extracted by multi-core convolutional neural network based on multi-weight (MM-CNN). The self-attention mechanism can enhance the fusion ability of the model on global information. In this research, the performance of the AgrCQS on the CCDIP is verified, and three benchmark datasets, namely, AFQMC, LCQMC, and BQ, are used. The accuracy rates are 93.92%, 74.42%, 86.35%, and 83.05%, respectively, which are higher than that of baseline systems without using any external knowledge. Additionally, the proposed method module can be extracted separately and applied to other models, thus providing reference for related research.
Å°¿öµå(Keyword) Text semantic similarity   Short text-similarity   Agricultural natural language processing   Chinese word segmentation  
ÆÄÀÏ÷ºÎ PDF ´Ù¿î·Îµå